YOLO-BEV: Generating Bird's-Eye View in the Same Way as 2D Object Detection
Vehicle perception systems strive to achieve comprehensive and rapid visual
interpretation of their surroundings for improved safety and navigation. We
introduce YOLO-BEV, an efficient framework that harnesses a unique surround-camera
setup to generate a 2D bird's-eye view of the vehicular environment. By
strategically positioning eight cameras, each at a 45-degree interval, our
system captures and integrates imagery into a coherent 3x3 grid format, leaving
the center blank, providing an enriched spatial representation that facilitates
efficient processing. In our approach, we employ YOLO's detection mechanism,
favoring its inherent advantages of swift response and compact model structure.
Rather than relying only on the conventional YOLO detection head, we augment it
with a custom-designed detection head that translates the panoramically captured
data into a unified bird's-eye view map around the ego car. Preliminary results
validate the feasibility of YOLO-BEV in real-time vehicular perception tasks.
With its streamlined architecture and potential for rapid deployment due to
minimized parameters, YOLO-BEV stands as a promising tool that may reshape future
perspectives in autonomous driving systems.
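The 3x3 input layout described in this abstract can be illustrated with a minimal sketch. The abstract does not say which camera goes into which cell, so the heading-to-cell mapping below (front camera at top centre, proceeding clockwise) and the helper name `assemble_grid` are assumptions made for illustration only.

```python
# Hypothetical layout: camera headings in degrees (0 = front, clockwise)
# mapped to (row, col) cells of the 3x3 grid; the centre cell (1, 1) stays blank.
CELL_FOR_HEADING = {
    315: (0, 0), 0: (0, 1), 45: (0, 2),
    270: (1, 0), 90: (1, 2),
    225: (2, 0), 180: (2, 1), 135: (2, 2),
}

def assemble_grid(images, height, width, blank=0):
    """Tile eight height x width images (dict: heading -> 2D list) into one
    3*height x 3*width canvas, leaving the centre tile filled with `blank`."""
    canvas = [[blank] * (3 * width) for _ in range(3 * height)]
    for heading, (row, col) in CELL_FOR_HEADING.items():
        image = images[heading]
        for y in range(height):
            for x in range(width):
                canvas[row * height + y][col * width + x] = image[y][x]
    return canvas

# Usage: each camera image is a 2x2 tile filled with (heading + 1).
images = {h: [[h + 1] * 2 for _ in range(2)] for h in CELL_FOR_HEADING}
grid = assemble_grid(images, height=2, width=2)
```

The composite canvas can then be passed to a single detector as one image, which is the property the abstract relies on for efficient processing.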
Federated Learning Framework Coping with Hierarchical Heterogeneity in Cooperative ITS
In this paper, we introduce a federated learning framework coping with
Hierarchical Heterogeneity (H2-Fed), which can notably enhance the conventional
pre-trained deep learning model. The framework exploits data from connected
public traffic agents in vehicular networks without affecting user data
privacy. By coordinating existing traffic infrastructure, including roadside
units and road traffic clouds, the model parameters are efficiently
disseminated by vehicular communications and hierarchically aggregated.
Considering the individual heterogeneity of data distribution, computational
and communication capabilities across traffic agents and roadside units, we
employ a novel method that addresses the heterogeneity of different aggregation
layers of the framework architecture, i.e., aggregation in layers of roadside
units and cloud. The experimental results indicate that our method balances
learning accuracy and stability well, based on knowledge of the
heterogeneity in current communication networks. Compared to other baseline
approaches, the evaluation on a Non-IID MNIST dataset shows that our framework
is more general and more capable, especially in application scenarios with low
communication quality. Even when 90% of the agents are timely disconnected, the
pre-trained deep learning model can still be forced to converge stably, and its
accuracy can be enhanced from 68% to over 90% after convergence.
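The two aggregation layers named in this abstract (roadside units, then cloud) can be sketched as follows. This is not the heterogeneity-aware method of the paper: it assumes plain sample-count weighting in the style of federated averaging, and the function names and data layout are hypothetical.

```python
def weighted_average(models, weights):
    """Weighted mean of parameter vectors (lists of floats)."""
    total = sum(weights)
    size = len(models[0])
    return [sum(w * m[i] for m, w in zip(models, weights)) / total
            for i in range(size)]

def aggregate_hierarchically(rsu_groups):
    """rsu_groups: one list per roadside unit (RSU), holding (params, n_samples)
    tuples for the agents currently connected to it.
    Returns the cloud-level model, or None if no agent reported."""
    rsu_models, rsu_weights = [], []
    for agents in rsu_groups:
        if not agents:  # every agent of this RSU is disconnected
            continue
        params = [p for p, _ in agents]
        counts = [n for _, n in agents]
        rsu_models.append(weighted_average(params, counts))  # RSU layer
        rsu_weights.append(sum(counts))
    if not rsu_models:
        return None
    return weighted_average(rsu_models, rsu_weights)  # cloud layer

groups = [
    [([1.0, 2.0], 10), ([3.0, 4.0], 30)],  # RSU 0: two connected agents
    [],                                     # RSU 1: all agents disconnected
    [([5.0, 6.0], 60)],                     # RSU 2: one connected agent
]
cloud_model = aggregate_hierarchically(groups)
```

Skipping roadside units with no connected agents is one simple way to keep a round going under poor connectivity; the paper's own treatment of disconnection may differ.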
Skipped Feature Pyramid Network with Grid Anchor for Object Detection
CNN-based object detection methods have achieved significant progress in
recent years. The classic structures of CNNs produce pyramid-like feature maps
due to the pooling or other re-scale operations. The feature maps in different
levels of the feature pyramid are used to detect objects with different scales.
For more accurate object detection, the highest-level feature, which has the
lowest resolution and contains the strongest semantics, is up-scaled and
connected with the lower-level features to enhance the semantics in the
lower-level features. However, the classic mode of feature connection combines
each lower-level feature with all the features above it, which may result in
semantics degradation. In this paper, we propose a skipped connection to obtain
stronger semantics at each level of the feature pyramid. In our method, the
lower-level feature only connects with the feature at the highest level, making
it more reasonable that each level is responsible for detecting objects with
fixed scales. In addition, we simplify the generation of anchors for bounding
box regression, which can further improve the accuracy of object detection.
Experiments on the MS COCO and Wider Face datasets demonstrate that our method
outperforms the state-of-the-art methods.
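The skipped connection described in this abstract can be sketched on toy data. The sketch assumes nearest-neighbour upsampling, element-wise addition as the fusion operator, and single-channel feature maps that halve in size at each level; none of these details is stated in the abstract, and the function names are hypothetical.

```python
def upsample_nearest(feature, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    return [[value for value in row for _ in range(factor)]
            for row in feature for _ in range(factor)]

def skipped_fusion(pyramid):
    """pyramid: 2D feature maps ordered from lowest level (largest) to
    highest level (smallest), each level half the size of the previous one.
    Every lower level is fused with the top level only."""
    top = pyramid[-1]
    fused = []
    for level, feature in enumerate(pyramid[:-1]):
        factor = 2 ** (len(pyramid) - 1 - level)
        top_up = upsample_nearest(top, factor)
        fused.append([[a + b for a, b in zip(row_f, row_t)]
                      for row_f, row_t in zip(feature, top_up)])
    fused.append(top)  # the top level is used as-is
    return fused

pyramid = [
    [[1] * 4 for _ in range(4)],   # level 0: 4x4
    [[10] * 2 for _ in range(2)],  # level 1: 2x2
    [[100]],                       # level 2: 1x1 (top)
]
fused = skipped_fusion(pyramid)
```

In this toy pyramid the level-0 output contains only its own values plus the top-level values; the level-1 values never reach it, which is the difference from the classic top-down cascade.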
ResFed: Communication Efficient Federated Learning by Transmitting Deep Compressed Residuals
Federated learning enables cooperative training among massively distributed
clients by sharing their learned local model parameters. However, with
increasing model size, deploying federated learning requires a large
communication bandwidth, which limits its deployment in wireless networks. To
address this bottleneck, we introduce a residual-based federated learning
framework (ResFed), where residuals rather than model parameters are
transmitted in communication networks for training. In particular, we integrate
two pairs of shared predictors for the model prediction in both
server-to-client and client-to-server communication. By employing a common
prediction rule, both locally and globally updated models are always fully
recoverable in clients and the server. We highlight that the residuals only
indicate the quasi-update of a model in a single inter-round, and hence contain
denser information and have a lower entropy than model weights and
gradients. Based on this property, we further conduct lossy
compression of the residuals by sparsification and quantization and encode them
for efficient communication. The experimental evaluation shows that ResFed
incurs remarkably lower communication costs and achieves better accuracy by
leveraging less sensitive residuals, compared to standard federated learning.
For instance, to train a 4.08 MB CNN model on CIFAR-10 with 10 clients under a
non-independent and identically distributed (Non-IID) setting, our approach
achieves a compression ratio over 700X in each communication round with minimal
impact on the accuracy. To reach an accuracy of 70%, it reduces the total
communication volume by around 99%, from 587.61 Mb to 6.79 Mb in up-streaming
and to 4.61 Mb in down-streaming, on average across all clients.
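The residual pipeline outlined in this abstract (predict, take the residual, sparsify, quantize, recover) can be sketched as follows. The sketch assumes top-k magnitude sparsification and symmetric uniform quantization, and it treats the predictor output as given; the actual predictors, compression settings, and encoding used by ResFed are not specified in the abstract, and all names below are hypothetical.

```python
def compress_residual(weights, predicted, keep, levels=255):
    """Return (scale, {index: quantized value}) for the `keep` largest-magnitude
    entries of the residual weights - predicted."""
    residual = [w - p for w, p in zip(weights, predicted)]
    top = sorted(range(len(residual)), key=lambda i: abs(residual[i]),
                 reverse=True)[:keep]  # sparsification (assumed top-k)
    max_abs = max((abs(residual[i]) for i in top), default=0.0)
    scale = max_abs / (levels // 2) if max_abs > 0 else 1.0
    # quantization (assumed symmetric, uniform)
    return scale, {i: round(residual[i] / scale) for i in top}

def recover_model(predicted, scale, quantized):
    """Rebuild the model from the shared prediction and the decoded residual."""
    model = list(predicted)
    for index, q in quantized.items():
        model[index] += q * scale
    return model

predicted = [0.50, -0.20, 0.10, 0.00]  # output of the shared predictor
updated = [0.52, -0.20, 0.35, 0.001]   # locally trained weights
scale, payload = compress_residual(updated, predicted, keep=2)
restored = recover_model(predicted, scale, payload)
```

Only the scale and the two index-value pairs would need to be transmitted here. Because sender and receiver apply the same prediction rule, the receiver rebuilds the model up to the loss introduced by sparsification and quantization.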
End-to-End Insulator String Defect Detection in a Complex Background Based on a Deep Learning Model
Normal power line insulators ensure the safe transmission of electricity. Insulator defects reduce the insulation, which may lead to the failure of power transmission systems. As unmanned aerial vehicles (UAVs) have developed rapidly, it has become possible for workers to take and upload aerial images of insulators. A technology that detects insulator defects with high accuracy in a short time can be of great value. Existing methods struggle with complex backgrounds, so they must first locate and extract the insulators. Some of them are tied to specific conditions such as angle, brightness, and object scale. This study aims to make end-to-end detections using aerial images of insulators, giving the locations of insulators and defects at the same time while overcoming the disadvantages mentioned above. A DEtection TRansformer (DETR), which has an encoder–decoder architecture, adopts a convolutional neural network (CNN) as the backbone network, applies a self-attention mechanism for computing, and uses object queries instead of a hand-crafted process to give direct predictions. We modified this model for insulator detection in complex aerial images. On the dataset we constructed, our model achieves a mean average precision of 97.97 at an intersection-over-union threshold of 0.5, which is better than Cascade R-CNN and YOLOv5. The inference speed of our model can reach 25 frames per second, which is sufficient for practical use. Experimental results demonstrate that our model meets the robustness and accuracy requirements for insulator defect detection.
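The intersection-over-union threshold of 0.5 used for the reported mean average precision can be made concrete with a short sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates, a format the abstract does not specify, and it covers only the matching test for one detection, not the full average-precision computation.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(predicted, ground_truth, threshold=0.5):
    """A detection counts as correct when its IoU reaches the threshold."""
    return iou(predicted, ground_truth) >= threshold

half_overlap = iou((0, 0, 10, 10), (0, 0, 10, 5))    # 50 / 100
weak_overlap = iou((5, 5, 15, 15), (0, 0, 10, 10))   # 25 / 175
```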
Efficient network-matrix architecture for general flow transport inspired by natural pinnate leaves
Networks embedded in three-dimensional matrices are beneficial for delivering physical flows to the matrices. Leaf architectures, pervasive natural network-matrix architectures, endow leaves with high transpiration rates and low water pressure drops, providing inspiration for efficient network-matrix architectures. In this study, the network-matrix model for general flow transport inspired by natural pinnate leaves is investigated analytically. The results indicate that the optimal network structure inspired by natural pinnate leaves can greatly reduce the maximum potential drop and the total potential drop caused by the flow through the network while maximizing the total flow rate through the matrix. These results can be used to design efficient networks in network-matrix architectures for a variety of practical applications, such as tissue engineering, cell culture, photovoltaic devices and heat transfer.
Full waveform inversion based on dynamic data matching of convolutional wavefields
The cycle-skipping problem, caused by the absence of low frequencies and an inaccurate initial model, makes full waveform inversion (FWI) deviate from the true model. A novel method is proposed to mitigate cycle skipping by dynamic data matching, which improves the matching of synthetic and observed events so that the initial model is updated in the correct direction. One-dimensional (1-D) Gaussian convolutional kernels of different lengths are used to extract features for each time sample in each trace; these features represent the integrated properties of the wavefield over different time ranges centered on that sample. The optimally matched pairs of time samples in the observed and synthetic traces are found by minimizing the Euclidean distance between their features. To improve the accuracy of data matching, a constraint is proposed that evaluates the reliability of the dynamic matching by attenuating the amplitude of the synthetic data according to the traveltime difference between each pair of optimally matched time samples. In addition, Gaussian kernels can accurately extract features of time samples contaminated by strong noise, which further improves the robustness of the proposed method. A scheme for selecting the optimal parameters is discussed to ensure the convergence of the proposed method. Numerical tests on the Marmousi model verify the feasibility of the proposed method. The proposed method provides a new approach to tackling the convergence problem of FWI when using field seismic data.
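The feature extraction and matching steps described in this abstract can be sketched on a toy trace. The kernel lengths, the choice of sigma, the zero-padding at the trace ends, and the search over the whole trace are all assumptions made for illustration; the reliability constraint and the inversion itself are not shown, and the function names are hypothetical.

```python
import math

def gaussian_kernel(length):
    """Normalised 1-D Gaussian kernel with an odd number of taps."""
    half = length // 2
    sigma = max(length / 6.0, 1e-6)  # assumed width: about 3 sigma per side
    taps = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def extract_features(trace, lengths=(3, 7, 11)):
    """One feature vector per time sample: the trace convolved with each
    kernel, zero-padded at the ends."""
    features = [[] for _ in trace]
    for length in lengths:
        kernel, half = gaussian_kernel(length), length // 2
        for t in range(len(trace)):
            value = 0.0
            for k, weight in enumerate(kernel):
                s = t + k - half
                if 0 <= s < len(trace):
                    value += weight * trace[s]
            features[t].append(value)
    return features

def match_samples(synthetic, observed):
    """For each synthetic sample, index of the observed sample whose feature
    vector is closest in Euclidean distance."""
    f_syn, f_obs = extract_features(synthetic), extract_features(observed)
    return [min(range(len(f_obs)), key=lambda j: math.dist(f_syn[i], f_obs[j]))
            for i in range(len(f_syn))]

observed = [0.0] * 20
observed[12] = 1.0   # event arrives at sample 12
synthetic = [0.0] * 20
synthetic[8] = 1.0   # same event arrives early, at sample 8
pairs = match_samples(synthetic, observed)
shift = pairs[8] - 8  # traveltime difference in samples
```

Samples far from any event have all-zero features and match ambiguously, which suggests why a reliability constraint on the matched pairs, as the abstract proposes, is useful.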
Effects of Flotage on Immersion Indentation Results of Bone Tissue: An Investigation by Finite Element Analysis
Nanoindentation is an efficient technique for probing the mechanical properties of biological tissue that is soaked in liquid media to preserve its bioactivity. However, the flotage imposed on the indenter leads to inaccuracy when mechanical properties (for instance, elastic modulus and hardness) are calculated by depth-sensing nanoindentation. In this paper, the effects of flotage on the nanoindentation results of cortical bone were investigated by finite element analysis (FEA) simulation. Nanoindentation simulation results for bone samples with and without soaking in liquid media were compared. The results show that the difference between the load-displacement curves of the soaked and unsoaked samples varies widely with indentation depth. In other words, nanoindentation measurements in liquid media will cause significant error in the calculated Young’s modulus and hardness due to the flotage. Taking the effect of flotage into account is therefore particularly important for the accurate biomechanical characterization of biological samples.